Adolescent Social Structure by Jim Moody
The political blogosphere and the 2004 election: Divided they blog by Lada Adamic
Network Map of Protein-Protein Interactions by Erich E. Wanker of the Max Delbrück Center for Molecular Medicine (MDC)
BIT Formation
International Conflict Event Warning System (ICEWS): Material Conflict by Minhas, Hoff, & Ward
Relational data consists of
GLM: \(y_{ij} \sim \beta^{T} X_{ij} + e_{ij}\)
Networks typically show evidence against independence of {\(e_{ij} : i \neq j\)}
Not accounting for dependence can lead to:
We've been hearing this concern for decades now:
Thompson & Walker (1982)
Frank & Strauss (1986)
Kenny (1996)
Krackhardt (1998)
Beck et al. (1998)
Signorino (1999)
Li & Loken (2002)
Hoff and Ward (2004)
Snijders (2011)
Erikson et al. (2014)
Aronow et al. (2015)
Athey et al. (2016)
# library(devtools) ; devtools::install_github('s7minhas/amen')
library(amen) # Load additive and multiplicative effects pkg
data(IR90s) # Load trade data
Y[1:5,1:5] # Data organized in an adjacency matrix
##           ARG        AUL       BEL        BNG        BRA
## ARG        NA 0.05826891 0.2468601 0.03922071 1.76473080
## AUL 0.0861777         NA 0.3784364 0.10436002 0.21511138
## BEL 0.2700271 0.35065687        NA 0.01980263 0.39877612
## BNG 0.0000000 0.01980263 0.1222176         NA 0.01980263
## BRA 1.6937791 0.23901690 0.6205765 0.03922071         NA
# Reciprocity
cor(c(Y), c(t(Y)), use='complete')
## [1] 0.9392867
# Reciprocity beyond nodal variation?
senMean = apply(Y, 1, mean, na.rm=TRUE)
recMean = apply(Y, 2, mean, na.rm=TRUE)
globMean = mean(Y, na.rm=TRUE)
resid <- Y - ( globMean + outer(senMean,recMean,"+"))
cor(c(resid), c(t(resid)), use='complete')
## [1] 0.8591242
\[ \begin{aligned} y_{ij} &= \mu + e_{ij} \\ e_{ij} &= a_{i} + b_{j} + \epsilon_{ij} \\ \{ (a_{1}, b_{1}), \ldots, (a_{n}, b_{n}) \} &\sim N(0,\Sigma_{ab}) \\ \{ (\epsilon_{ij}, \epsilon_{ji}) : \; i \neq j\} &\sim N(0,\Sigma_{\epsilon}), \text{ where } \\ \Sigma_{ab} = \begin{pmatrix} \sigma_{a}^{2} & \sigma_{ab} \\ \sigma_{ab} & \sigma_{b}^2 \end{pmatrix} \;\;\;\;\; &\Sigma_{\epsilon} = \sigma_{\epsilon}^{2} \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \end{aligned} \]
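The social relations model above can be made concrete by simulating from it. The following is a minimal sketch in base R, not code from the talk: the number of nodes and every parameter value (`mu`, `Sab`, `s2`, `rho`) are assumptions chosen only for illustration.

```r
# Simulate one directed network from the social relations model.
# All parameter values below are illustrative assumptions.
set.seed(1)
n   <- 30                          # number of nodes
mu  <- 0                           # intercept
Sab <- matrix(c(1.0, 0.5,
                0.5, 1.0), 2, 2)   # covariance of (a_i, b_i)
s2  <- 1                           # dyadic variance sigma_eps^2
rho <- 0.7                         # within-dyad correlation

# Correlated sender/receiver effects: row i of ab is (a_i, b_i)
ab <- matrix(rnorm(n * 2), n, 2) %*% chol(Sab)
a  <- ab[, 1]
b  <- ab[, 2]

# Correlated dyadic errors: one draw of (eps_ij, eps_ji) per unordered pair
Seps  <- s2 * matrix(c(1, rho, rho, 1), 2, 2)
E     <- matrix(NA, n, n)
upper <- which(upper.tri(E), arr.ind = TRUE)        # (i, j) with i < j
pairs <- matrix(rnorm(nrow(upper) * 2), ncol = 2) %*% chol(Seps)
E[upper]            <- pairs[, 1]                   # eps_ij
E[upper[, c(2, 1)]] <- pairs[, 2]                   # eps_ji

# y_ij = mu + a_i + b_j + eps_ij, with an undefined diagonal
Ysim <- mu + outer(a, b, "+") + E
diag(Ysim) <- NA
```

Running `cor(c(Ysim), c(t(Ysim)), use='complete')` on the result gives the same reciprocity summary used for the trade data, so the effect of changing `rho` or `Sab` can be checked directly.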
\[ \begin{aligned} y_{i,j} = \beta_d^T \textbf{x}_{d,i,j} + \beta_r^T \textbf{x}_{r,i} +\beta_c^T \textbf{x}_{c,j} + a_i + b_j + \epsilon_{i,j} \end{aligned} \]
Variables we might want to include:
(Hoff 2005; Westveld & Hoff 2010; Hoff et al. 2013; Fosdick & Hoff 2015; Minhas et al. 2016)
Threshold model: linking latent \(Z\) to \(Y\)
Social relations model: inducing network covariance
Gibbs Sampler for Bayesian estimation
An MCMC routine providing a fit to an additive and multiplicative effects (AME) regression model to relational data of various types
Arguments:
Nodal covariates should be structured as:
Xrow and Xcol
Xn[1:10,]
##          pop      gdp polity
## ARG 3.548755 5.864710   7.18
## AUL 2.895912 6.011414  10.00
## BEL 2.314514 5.370685  10.00
## BNG 4.789989 5.177956   5.00
## BRA 5.070915 6.963597   8.00
## CAN 3.377588 6.531009  10.00
## CHN 7.091101 8.114522  -7.00
## COL 3.652734 5.324862   7.82
## EGY 4.063542 5.371521  -3.55
## FRN 4.082272 7.101956   9.00
Dyadic covariates should be structured as:
Xd[1:3,1:3,]
conflicts
##     ARG AUL BEL
## ARG  NA   0   0
## AUL   0  NA   0
## BEL   0   0  NA
distance
##       ARG   AUL   BEL
## ARG    NA 11.72 11.31
## AUL 11.72    NA 16.71
## BEL 11.31 16.71    NA
shared_igos
##      ARG  AUL  BEL
## ARG   NA 3.83 3.92
## AUL 3.83   NA 4.02
## BEL 3.92 4.02   NA
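The dyadic covariate structure shown above is an n x n x p array with one slice per variable. A minimal sketch of assembling such an array by hand is given below. The object name `Xd_toy` is hypothetical, and the values are the three-country excerpt printed above.

```r
# Build a small n x n x p dyadic covariate array, one n x n slice per variable.
# Xd_toy is a hypothetical name; values are the 3-country excerpt shown above.
nodes <- c("ARG", "AUL", "BEL")
vars  <- c("conflicts", "distance", "shared_igos")

Xd_toy <- array(NA_real_,
                dim      = c(length(nodes), length(nodes), length(vars)),
                dimnames = list(nodes, nodes, vars))

Xd_toy[, , "conflicts"] <- matrix(c(NA, 0, 0,
                                    0, NA, 0,
                                    0, 0, NA), 3, 3, byrow = TRUE)
Xd_toy[, , "distance"] <- matrix(c(NA, 11.72, 11.31,
                                   11.72, NA, 16.71,
                                   11.31, 16.71, NA), 3, 3, byrow = TRUE)
Xd_toy[, , "shared_igos"] <- matrix(c(NA, 3.83, 3.92,
                                      3.83, NA, 4.02,
                                      3.92, 4.02, NA), 3, 3, byrow = TRUE)

# Entry [i, j, v] is the value of variable v for the dyad from i to j
Xd_toy["ARG", "BEL", "distance"]
```

Naming the third dimension keeps the coefficient labels readable once the array is passed as a dyadic covariate.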
fitSRM = ame(Y=Y,
Xdyad=Xd, # incorp dyadic covariates
Xrow=Xn, # incorp sender covariates
Xcol=Xn, # incorp receiver covariates
symmetric=FALSE, # tell AME trade is directed
intercept=TRUE, # add an intercept
model='nrm', # model type
rvar=TRUE, # sender random effects (a)
cvar=TRUE, # receiver random effects (b)
dcor=TRUE, # dyadic correlation
R=0, # we'll get to this later
nscan=10000, burn=5000, odens=25,
plot=FALSE, print=FALSE, gof=TRUE
)
objects returned in fitSRM
names(fitSRM)
## [1] "BETA" "VC" "APM" "BPM" "U" "V" "UVPM" "EZ" "YPM" "GOF"
paramPlot(fitSRM$BETA)
grid.arrange( paramPlot(fitSRM$VC),
arrangeGrob( abPlot(fitSRM$APM, 'Sender Effects'),
abPlot(fitSRM$BPM, 'Receiver Effects') ), ncol=2 )
gofPlot(fitSRM$GOF, symmetric=FALSE)
Let's build on what we have so far and find an expression for \(\gamma\):
\[ y_{ij} \approx \beta^{T} X_{ij} + a_{i} + b_{j} + \gamma(u_{i},v_{j}) \]
(Holland et al. 1983; Nowicki & Snijders 2001; Rohe et al. 2011; Airoldi et al. 2013)
Each node \(i\) is a member of an (unknown) latent class:
\[ \textbf{u}_{i} \in \{1, \ldots, K \}, \; i \in \{1,\ldots, n\} \]
The probability of a tie between \(i\) and \(j\) is:
\[ Pr(Y_{ij}=1 | \textbf{u}_{i}, \textbf{u}_{j}) = \theta_{\textbf{u}_{i} \textbf{u}_{j}} \]
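To make the stochastic block model concrete, here is a minimal base R simulation sketch. The number of nodes, the number of classes, and the block probabilities in `theta` are assumptions chosen for illustration (dense within blocks, sparse between them).

```r
# Simulate a binary network from a stochastic block model.
# n, K and the entries of theta are illustrative assumptions.
set.seed(2)
n <- 40
K <- 3
u <- sample(1:K, n, replace = TRUE)   # latent class u_i of each node

theta <- matrix(0.05, K, K)           # between-block tie probability
diag(theta) <- 0.6                    # within-block tie probability

P    <- theta[u, u]                   # P[i, j] = theta[u_i, u_j]
Ysbm <- matrix(rbinom(n * n, 1, P), n, n)
diag(Ysbm) <- NA                      # no self-ties

# Ties should be far denser within classes than between them
same <- outer(u, u, "==")
c(within  = mean(Ysbm[same],  na.rm = TRUE),
  between = mean(Ysbm[!same], na.rm = TRUE))
```

Because the tie probability depends only on the pair of class labels, nodes in the same class are stochastically equivalent, which is the pattern the packages listed below try to recover when the labels are unknown.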
Software packages:
statnet (Handcock et al. 2016)
blockmodels (Leger 2015)
Newman (2006): Adjectives and Nouns
White & Murphy (2016): Mixed membership stochastic block model
(Hoff et al. 2002; Krivitsky et al. 2009; Sewell & Chen 2015)
Each node \(i\) has an unknown latent position
\[ \textbf{u}_{i} \in \mathbb{R}^{k} \]
The probability of a tie from \(i\) to \(j\) depends on the distance between them
\[ Pr(Y_{ij}=1 | \textbf{u}_{i}, \textbf{u}_{j}) = \theta - |\textbf{u}_{i} - \textbf{u}_{j}| \]
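A minimal base R sketch of the latent distance idea follows. The expression above is written directly on the probability scale. The sketch instead passes \(\theta - |\textbf{u}_{i} - \textbf{u}_{j}|\) through a logistic link so that every value is a valid probability. That link, along with the values of `n`, `k`, and `theta`, is an assumption made for illustration.

```r
# Simulate a binary network from a latent distance model.
# Assumption: a logistic link maps theta - distance to a probability.
set.seed(3)
n <- 40
k <- 2
U <- matrix(rnorm(n * k), n, k)   # latent positions u_i in R^k

D     <- as.matrix(dist(U))       # D[i, j] = Euclidean distance |u_i - u_j|
theta <- 1                        # baseline propensity (illustrative)
P     <- plogis(theta - D)        # tie probability falls as distance grows

Yld <- matrix(rbinom(n * n, 1, P), n, n)
diag(Yld) <- NA

# Ties should be negatively related to latent distance
cor(c(D), c(Yld), use = "complete.obs")
```

Because distance obeys the triangle inequality, nodes close to a common third node are also close to each other, which is how this model captures homophily and transitivity.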
Software packages:
latentnet (Krivitsky et al. 2015)
VBLPCM (Salter-Townshend 2015)
Kirkland (2012): North Carolina Legislators
Kuh et al. (2015): Discerning prey and predators from food web
(Hoff 2003; Hoff 2007)
Each node \(i\) has an unknown latent factor
\[ \textbf{u}_{i} \in \mathbb{R}^{k} \]
The probability of a tie from \(i\) to \(j\) depends on their latent factors
\[ \begin{aligned} Pr(Y_{ij}=1 | \textbf{u}_{i}, \textbf{u}_{j}) =& \theta + \textbf{u}_{i}^{T} \Lambda \textbf{u}_{j} \, \text{, where} \\ &\Lambda \text{ is a } k \times k \text{ diagonal matrix} \end{aligned} \]
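The bilinear term \(\textbf{u}_{i}^{T} \Lambda \textbf{u}_{j}\) can be computed for all dyads at once with matrix products. The sketch below is a minimal base R illustration. The values of `n`, `k`, `theta`, and the diagonal of `Lambda` are assumptions, as is the logistic link used to turn the linear predictor into a probability.

```r
# Simulate a binary network from a latent factor model.
# Assumptions: logistic link; illustrative values for n, k, theta, Lambda.
set.seed(4)
n <- 40
k <- 2
U <- matrix(rnorm(n * k), n, k)   # latent factors u_i in R^k

# Diagonal Lambda: a positive entry rewards similarity on that factor,
# a negative entry rewards dissimilarity
Lambda <- diag(c(1, -1))
theta  <- -1

M <- theta + U %*% Lambda %*% t(U)   # M[i, j] = theta + u_i' Lambda u_j
P <- plogis(M)

Ylf <- matrix(rbinom(n * n, 1, P), n, n)
diag(Ylf) <- NA
```

Allowing entries of `Lambda` to take either sign is what lets a single latent factor model represent both homophily and stochastic equivalence.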
Software packages:
amen (Hoff et al. 2015)
Multiplicative effects can be added by toggling the R input parameter
fitAME = ame(Y=Y,
Xdyad=Xd, # incorp dyadic covariates
Xrow=Xn, # incorp sender covariates
Xcol=Xn, # incorp receiver covariates
symmetric=FALSE, # tell AME trade is directed
intercept=TRUE, # add an intercept
model='nrm', # model type
rvar=TRUE, # sender random effects (a)
cvar=TRUE, # receiver random effects (b)
dcor=TRUE, # dyadic correlation
R=2, # 2 dimensional multiplicative effects
nscan=10000, burn=25, odens=25,
plot=FALSE, print=FALSE, gof=TRUE
)
gofPlot(fitAME$GOF, symmetric=FALSE)
ggCirc(Y=Y, U=fitAME$U, V=fitAME$V)
Cranmer et al. (2017)
Out-of-sample Network Cross-Validation
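One way to set up an out-of-sample comparison on network data is to hold out dyads rather than nodes. The sketch below is a minimal base R illustration of that idea and is not necessarily the procedure used in the talk: the toy matrix `Ytoy`, the number of folds, and the helper `maskFold` are all hypothetical.

```r
# Dyad-level k-fold partition for network cross-validation.
# Ytoy, nFolds and maskFold() are hypothetical, for illustration only.
set.seed(5)
n      <- 20
nFolds <- 5
Ytoy   <- matrix(rnorm(n * n), n, n)
diag(Ytoy) <- NA

# Randomly assign every observed (off-diagonal) dyad to one fold
foldID  <- matrix(NA_integer_, n, n)
offDiag <- which(!is.na(Ytoy))
foldID[offDiag] <- sample(rep(1:nFolds, length.out = length(offDiag)))

# Training matrix for fold f: held-out dyads are set to missing
maskFold <- function(Y, foldID, f) {
  Y[which(foldID == f)] <- NA
  Y
}

Ytrain1  <- maskFold(Ytoy, foldID, 1)    # fit the model on this
heldOut1 <- Ytoy[which(foldID == 1)]     # score predictions against this
```

Each masked matrix would then be passed to the model in place of the full network, and the model's predictions for the masked entries compared with the held-out values, repeating over all folds.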
Does AME actually reduce bias?
Hoff provides an argument that rests on exchangeability (Aldous, 1985)
Basis of simulation analysis
# Network simulation
simY = simulate.formula(network(n) ~ edges + edgecov(edgeVar) + networkTerm,
coef=c(
interceptValue,
dyadParamValue,
netParamValue
) )
# Run ergm
ergm(simY ~ edges + edgecov(edgeVar) + networkTerm)
# Run ame with and without multiplicative effects
ame(simY, Xdyad=edgeVar, R=0)
ame(simY, Xdyad=edgeVar, R=2)
Other things
Have been working with Hoff to:
Lots of cool stuff has been/is being done with this general framework:
LFM is not the end-all, be-all model [duh]; it's a powerful framework that has proven useful for some
These interdependent relations may at times be of interest themselves or in other cases may just help us to better predict